Personalizing a speech synthesizer by voice adaptation
نویسندگان
چکیده
A voice adaptation system enables users to quickly create new voices for a text-to-speech system, allowing for the personalization of the synthesis output. The system adapts to the pitch and spectrum of the target speaker, using a probabilistic, locally linear conversion function based on a Gaussian Mixture Model. Numerical and perceptual evaluations reveal insights into the correlation between adaptation quality and the amount of training data, the number of free parameters. A new joint density estimation algorithm is compared to a previous approach. Numerical errors are studied on the basis of broad phonetic categories. A data augmentation method for training data with incomplete phonetic coverage is investigated and found to maintain high speech quality while partially adapting to the target voice.
منابع مشابه
Text-to-speech voice adaptation from sparse training data
Voice adaptation describes the process of converting the output of a text-to-speech synthesizer voice to sound like a different voice after a training process in which only a small amount of the desired target speaker’s speech is seen. We employ a locally linear conversion function based on Gaussian mixture models to map bark-scaled line spectral frequencies. We compare performance for three di...
متن کاملContinuous Control of the Degree of Articulation in HMM-Based Speech Synthesis
This paper focuses on the implementation of a continuous control of the degree of articulation (hypo/hyperarticulation) in the framework of HMM-based speech synthesis. The adaptation of a neutral speech synthesizer to generate hypo and hyperarticulated speech using a limited amount of speech data is first studied. This is done using inter-speaker voice adaptation techniques, applied here to int...
متن کاملStudy on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...
متن کاملHMM-based speech synthesis with various degrees of articulation: A perceptual study
HMM-based speech synthesis is very convenient for creating a synthesizer whose speaker characteristics and speaking styles can be easily modified. This can be obtained by adapting a source speaker’s model to a target speaker’s model, using intra-speaker voice adaptation techniques. In this article, we focus on high-quality HMM-based speech synthesis integrating various degrees of articulation, ...
متن کاملCross-lingual voice conversion-based polyglot speech synthesizer for indian languages
A polyglot speech synthesizer, synthesizes speech for any given monolingual or multilingual text, in a single speaker’s voice. In this regard, a polyglot speech corpus is required. It is difficult to find a speaker proficient in multiple languages. Therefore, in the current work, by exploiting the acoustic similarity of phonemes across Indian languages, a polyglot speech corpus is obtained for ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1998